Nested Loop Sequences: Towards Efficient Loop Structures in Automatic Parallelization
Author
Abstract
An important problem in the automatic parallelization of scientific programs is generating loops from an algebraic description of the iteration domain. The usual technique is to produce a perfectly nested set of loops whose bounds consist of maxima and minima of several affine functions. However, perfect loop nests suffer from the run-time overhead of evaluating bound expressions and cannot scan non-convex domains efficiently. In this paper we study a candidate loop structure for overcoming these problems. This structure, called a nested loop sequence (NLS), is defined as a sequence of DO loops whose bodies are nonempty sequences of DO loops. We propose an algorithm to compute an NLS scanning a given convex polyhedron, which overcomes the run-time overhead problem. The algorithm produces a loop structure in which each bound of every loop consists of a single affine function.

Séquences imbriquées de boucles : vers des structures de boucles efficaces pour la parallélisation automatique

Résumé: Automatic restructuring of scientific programs requires generating new loop structures for the final program. This generation starts from an algebraic description of the iteration domain. Usually, the compiler generates a perfectly nested loop nest in which the bounds are extrema of several affine functions. This approach leads to significant run-time overhead and does not allow non-convex iteration domains to be scanned. In this document we study a loop structure, called a nested loop sequence, that solves both problems. We propose an algorithm that computes such a structure enumerating the points of a convex domain, which solves the first problem: the algorithm produces a tree of loops in which each bound consists of a single affine function.
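The contrast between the two loop structures can be illustrated with a small hand-built example. This is only a sketch under stated assumptions, not the algorithm proposed in the paper: the domain {(i, j) : 0 ≤ i ≤ 2n, 0 ≤ j ≤ i, j ≤ 2n − i} and the function names `scan_perfect_nest` and `scan_nested_loop_sequence` are chosen here for illustration. The first function scans the domain with a perfect loop nest whose inner upper bound is a minimum of two affine functions; the second splits the outer loop by hand into a sequence of two loops so that every bound is a single affine function.

```python
def scan_perfect_nest(n):
    """Perfect loop nest: the upper bound of j is a minimum of two affine functions."""
    points = []
    for i in range(0, 2 * n + 1):
        # min() must be evaluated at every iteration of the outer loop
        for j in range(0, min(i, 2 * n - i) + 1):
            points.append((i, j))
    return points


def scan_nested_loop_sequence(n):
    """Sequence of loops: every bound is a single affine function of n and i."""
    points = []
    # First loop: for 0 <= i <= n the constraint j <= i is the tight one.
    for i in range(0, n + 1):
        for j in range(0, i + 1):
            points.append((i, j))
    # Second loop: for n < i <= 2n the constraint j <= 2n - i is the tight one.
    for i in range(n + 1, 2 * n + 1):
        for j in range(0, 2 * n - i + 1):
            points.append((i, j))
    return points
```

Both functions enumerate the same points in the same lexicographic order; the second one trades a slightly longer program for bounds that need no `min` or `max` evaluation at run time.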
Similar resources
Affine Transformations for Communication Minimized Parallelization and Locality Optimization of Arbitrarily Nested Loop Sequences
A long running program often spends most of its time in nested loops. The polyhedral model provides powerful abstractions to optimize loop nests with regular accesses for parallel execution. Affine transformations in this model capture a complex sequence of execution-reordering loop transformations that improve performance by parallelization as well as better locality. Although a significant am...
Affine Transformations for Communication Minimal Parallelization and Locality Optimization of Arbitrarily Nested Loop Sequences
A long running program often spends most of its time in nested loops. The polyhedral model provides powerful abstractions to optimize loop nests with regular accesses for parallel execution. Affine transformations in this model capture a complex sequence of execution-reordering loop transformations that improve performance by parallelization as well as better locality. Although a significant am...
Dependency Analysis of For-loop Structures for Automatic Parallelization of C Code
Dependency analysis techniques used for parallelizing compilers can be used to produce coarse-grained code for distributed memory systems such as a cluster of workstations. Nested for-loops provide opportunities for this coarse-grained parallelization. This paper describes current dependency analysis tests that can be used to identify ways for transforming sequential C code into parallel C code....
On the Parallelization of Loop Nests Containing while Loops
Recently, efforts have been made to devise automatic methods, based on a mathematical model, for the parallelization of loop nests with while loops. These methods are extensions of methods for the parallelization of nested for loops.
Quantifier Elimination in Automatic Loop Parallelization
We present an application of quantifier elimination techniques in the automatic parallelization of nested loop programs. The technical goal is to simplify affine inequalities whose coefficients may be unevaluated symbolic constants. The values of these so-called structure parameters are determined at run time and reflect the problem size. Our purpose here is to make the research community of qu...